Keep docker & k8s pod annotations while they are needed #5084
Conversation
@dengqingpei1990 this PR should fix #4986, just in case you want to give this a try before we merge it. From my local tests, it seems to work nicely.
A few corner scenarios break with this change. One I have left as a comment on the method itself; the other is as follows:
Take this scenario: a user deploys, under namespace 1, a StatefulSet whose pods come up as zk-1, zk-2, zk-3.
In Metricbeat we use the IpPort and PodName indexes. If the pods of a StatefulSet get restarted/recreated all together, they come back up with the same names and, in most scenarios, the same IP/port combinations.
When we look up pods in this change, we check the deleted cache first and then go to the existing pod workloads. It should ideally be the other way around: if a StatefulSet pod is updated with a new set of labels, the index keys remain the same but the metadata will have changed.
Both scenarios should be fixable by checking live pod metadata first, returning it when found, and only then checking dead pods (see the sketch below).
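A minimal sketch of that lookup order, assuming hypothetical names (podCache, live, deleted and metadata are illustrative, not the PR's actual types):

package cache

import "sync"

// metadata stands in for the pod annotations; all names here are illustrative.
type metadata map[string]string

type podCache struct {
	sync.RWMutex
	live    map[string]metadata // keyed by PodName or IpPort indexes
	deleted map[string]metadata // recently deleted pods, kept for late readers
}

// lookup prefers live pods, so a recreated or relabeled StatefulSet pod
// that reuses an index key never serves stale metadata from the deleted map.
func (c *podCache) lookup(index string) metadata {
	c.RLock()
	defer c.RUnlock()
	if meta, ok := c.live[index]; ok {
		return meta
	}
	return c.deleted[index] // nil if unknown
}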
p.annotationCache.RLock()
meta, ok := p.annotationCache.annotations[arg]
var deleted bool
if ok {
For logs, we can update the pod labels using kubectl label po. The container name remains the same, and the update will call onPodDelete && onPodAdd. So now the living container's metadata is in the deleted map, and this portion of code will end up returning old metadata.
Ideally we should first check metadata from live pods and return that.
I've added a commit addressing this: basically, when the add method is called it removes any deleted flag for all of the pod's indexes (sketched below).
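Building on the hypothetical podCache sketch above, the fix could look roughly like this (again a sketch under assumed names, not the PR's actual diff):

// onPodAdd is also called on pod updates; clearing the deleted mark here
// means a relabel's onPodDelete/onPodAdd sequence cannot strand a live pod
// in the deleted map.
func (c *podCache) onPodAdd(indexes []string, meta metadata) {
	c.Lock()
	defer c.Unlock()
	for _, index := range indexes {
		delete(c.deleted, index) // un-delete: the pod is alive again
		c.live[index] = meta
	}
}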
good point, I've addressed the issue 👍
w.containers[event.Actor.ID] = &Container{
	ID:     event.Actor.ID,
	Name:   name,
	Image:  image,
	Labels: event.Actor.Attributes,
}
w.Unlock()
}

// Delete
if event.Action == "die" || event.Action == "kill" {
If a container dies and gets started again it will still time out eventually and lose the metadata.
I think that case is handled, as onPodAdd (also called in case of an update) cleans any deleted flag: https://github.com/elastic/beats/pull/5084/files#diff-c211c9481fb346b82d8713e779e0be9fR141
Sorry, you are right; I was talking about the Kubernetes watcher, not the Docker one. I will apply the same fix, thanks!
I'm also working on tests.
pushed b54187e
When I restart a container, the events are: kill -> die -> stop -> start -> restart. So the container will not be un-deleted in this case.
Looking at it, adding the metadata on "created" and removing it on "die" does seem inconsistent to me.
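A sketch of the symmetry being discussed: keying the add on "start" means a restart (kill -> die -> stop -> start -> restart) re-registers the container and clears its deleted mark. This is a fragment of a hypothetical event loop; field names beyond those quoted in the diff are assumptions:

switch event.Action {
case "start":
	w.Lock()
	w.containers[event.Actor.ID] = &Container{
		ID:     event.Actor.ID,
		Name:   event.Actor.Attributes["name"],
		Image:  event.Actor.Attributes["image"],
		Labels: event.Actor.Attributes,
	}
	delete(w.deleted, event.Actor.ID) // un-delete on restart
	w.Unlock()

case "die", "kill":
	w.Lock()
	w.deleted[event.Actor.ID] = time.Now() // keep metadata until cleanup times out
	w.Unlock()
}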
Let me do some tests; it looks like it would be better to add it on start, wouldn't it?
I think adding on start is the better approach.
Force-pushed from 162f58c to b54187e
Force-pushed from 1ea2445 to cc7260e
LGTM
jenkins retest this
Code LGTM. I think this needs an addition to the config file reference and potentially also docs?
}, nil
}

// Container returns the running container with the given ID or nil if unknown
func (w *watcher) Container(ID string) *Container {
-	return w.containers[ID]
+	w.RLock()
+	container := w.containers[ID]
I'm aware the code was here before, but should we check ok here, to see if the ID actually exists?
We get nil if the container is not there, and we handle that case.
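As a usage sketch, that nil contract lets callers guard in one line (enrich is a hypothetical helper, not from the PR):

if container := w.Container(containerID); container != nil {
	enrich(event, container.Labels) // only annotate events for known containers
}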
var toDelete []string
timeout := time.Now().Add(-w.cleanupTimeout)
w.RLock()
for key, lastSeen := range w.deleted {
Should we add some debug logging to this? Otherwise it's very hard to figure out what is still in memory and what was removed.
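One possible shape for that logging, using libbeat's logp package; the selector and message text are assumptions, not from the PR:

for key, lastSeen := range w.deleted {
	if lastSeen.Before(timeout) {
		logp.Debug("docker", "Removing container %s metadata after cleanup timeout", key)
		toDelete = append(toDelete, key)
	}
}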
// Check entries for timeout
var toDelete []string
timeout := time.Now().Add(-p.cleanupTimeout)
p.annotationCache.RLock()
Not for this PR, but this code looks so similar to the other cleanupWorker that I wonder if parts could be shared.
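Purely as an illustration of that sharing idea (hypothetical, not part of this PR), both workers could delegate the expiry scan to one helper:

// expiredKeys returns the keys whose last-seen time is older than the
// cleanup timeout; both the docker and kubernetes cleanup workers could
// call it and then drop the returned keys under their own write locks.
func expiredKeys(deleted map[string]time.Time, cleanupTimeout time.Duration) []string {
	timeout := time.Now().Add(-cleanupTimeout)
	var keys []string
	for key, lastSeen := range deleted {
		if lastSeen.Before(timeout) {
			keys = append(keys, key)
		}
	}
	return keys
}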
LGTM. Could you squash the commits?
In some cases pod annotations are needed after the pod is deleted, for instance when filebeat is reading the log behind the container. This change makes sure we keep metadata after a pod is gone. By storing access times we ensure that it's available as long as it's being used.
Force-pushed from 6357cec to 7d3314e
done, thanks for the review everyone!
In some cases pod annotations are needed after the pod is deleted, for instance when filebeat is reading the log behind the container. This change makes sure we keep metadata after a pod is gone. By storing access times we ensure that it's available as long as it's being used (cherry picked from commit a49463c)
(elastic#5133) In some cases pod annotations are needed after the pod is deleted, for instance when filebeat is reading the log behind the container. This change makes sure we keep metadata after a pod is gone. By storing access times we ensure that it's available as long as it's being used (cherry picked from commit d24a88a)
In some cases pod annotations are needed after the container/pod is deleted, for
instance when filebeat is reading the log behind the container.
This change makes sure we keep metadata after a pod is gone. By storing
access times we ensure that it's available as long as it's being used.
fixes #4986
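A minimal sketch of the access-time mechanism the description refers to, under assumed names: every lookup of a container marked deleted refreshes its timestamp, so the metadata survives exactly as long as something (e.g. filebeat tailing the container's log) keeps using it.

// Container is sketched here, not copied from the PR: it bumps the
// last-access time of deleted containers so the cleanup worker keeps
// their metadata around while it is still being read.
func (w *watcher) Container(ID string) *Container {
	w.RLock()
	container := w.containers[ID]
	w.RUnlock()

	// If the container was marked deleted, refresh its access time so
	// the cleanup worker does not expire it while it is still in use.
	w.Lock()
	if _, ok := w.deleted[ID]; ok {
		w.deleted[ID] = time.Now()
	}
	w.Unlock()

	return container
}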